I. Introduction

Multiuser MIMO (MU-MIMO) technology is expected to play a key role in future wireless cellular 
networks [1], [2]. The MIMO Gaussian Broadcast Channel (BC) model [3]-[7] serves as the information-
theoretic foundation for various MU-MIMO downlink schemes. In particular, the MIMO Gaussian BC
capacity region with zero common message rate was characterized in [7], where the optimality of Gaussian
Dirty-Paper Coding (DPC) is shown, subject to a general convex input covariance constraint. 

In a multi-cell scenario, depending on the level of inter-cell cooperation, we are in the presence 
of a MIMO broadcast and interference channel, which is not yet fully understood in an information 
theoretic sense. A simple and analytically tractable model for the multi-cell system was introduced by 
Wyner [8]. In both the one-dimensional linear cellular array and the two-dimensional hexagonal cellular pattern,
only interference from adjacent cells is considered with a single scaling factor, and the uplink capacity 
is obtained in a closed form in the case of full joint processing of all cells and no fading. Wyner's 
setting was extended in several works [9]-[13]. Single-cell processing and joint two-cell processing were
investigated by treating the inter-cell interference as Gaussian noise in [9], and the flat-fading channel
case with full joint cell processing was treated in [10]. This model was modified and extended to take
into account various issues such as soft hand-off and limited inter-cell cooperation due to constrained
backhaul capacity (see [11]-[13] and references therein).

Although the Wyner model captures some fundamental aspects of the multi-cell problem, its rather 
unrealistic assumption for the pathloss makes the system essentially symmetric with respect to any user. 
More realistically, users in different locations of the cellular coverage region are subject to distance-
dependent pathloss that may have more than 30 dB of dynamic range [14], and therefore they are in
fundamentally asymmetric conditions. It follows that characterizing the sum-capacity (or achievable sum- 
throughput, under some suboptimal scheme) is rather meaningless from a system performance viewpoint, 
unless some appropriate notion of fairness is taken into account. In fact, if the sum-throughput is the 
only objective, the resulting rate and power optimization under distance-dependent pathloss would lead 
to the solution of serving only the users close to their base station (BS), while leaving the users at the 
cell edge to starve. 

As a matter of fact, "fairness" is a fundamental aspect in cellular networks. The problem of downlink
scheduling subject to some fairness criterion has been widely studied (see for example [15]-[17] and
references therein). The goal of fairness scheduling is to make the system operate at some point of
its ergodic achievable rate region such that a suitable concave and increasing network utility function
is maximized [18]. By choosing the shape of the network utility function, a desired fairness criterion
can be enforced. The framework of stochastic network optimization [16] can be leveraged in order
to systematically devise scheduling algorithms that perform arbitrarily close to the optimal achievable 
fairness point, even when the explicit computation of the achievable ergodic rate region is hopelessly 
complicated. The fairness operating point is given as the time-averaged rate obtained by applying a 
dynamic scheduling algorithm on a slot-by-slot basis. Hence, its analytical characterization is generally 
very difficult and the system performance is typically evaluated by letting the scheduling algorithm evolve
in time and computing the time-averaged rates by Monte Carlo simulation [19]-[29].

In this paper, we propose an alternative approach based on the "large-system limit." We leverage
results on large random matrices [30]-[34] in order to characterize the system achievable rate region in
the limit where both the number of antennas per BS and the number of users per cell grow to infinity 
with a fixed ratio. Our model encompasses arbitrary user locations and distance-dependent pathloss 
and considers arbitrary inter-cell cooperation clusters, where the BSs in the same cluster operate as a 
distributed antenna array (full cooperation) and inter-cluster interference is treated as Gaussian noise 
(no inter-cluster cooperation). As special cases, we recover conventional cellular systems (no inter-cell 
cooperation) and the case of full cooperation. In the large-system limit, the channel randomness disappears 
and the MU-MIMO system becomes a deterministic network. It follows that the performance of dynamic 
fairness scheduling can be calculated by solving a "static" convex optimization problem. By incorporating 
the large random matrix results into the convex optimization solution, we solve this problem in almost 
closed form (up to the numerical solution of a fixed-point equation). The solution is particularly simple 
when each cooperation cluster satisfies certain symmetry conditions that will be discussed later on. The 
proposed method is much more efficient than Monte Carlo simulation and, somewhat surprisingly, it
provides results that closely match the performance of finite-dimensional systems, even for very
small dimensions.

The remainder of this paper is organized as follows. In Section II we present the MU-MIMO downlink
system model with cell cluster cooperation and formulate the fairness scheduling problem. We develop
the numerical solution for the input covariance maximizing the weighted average sum rate in the
large-system limit in Section III. In Section IV we use these results in order to obtain a semi-analytic method
to calculate the optimal ergodic fairness rate point in the asymptotic regime. In Section V the asymptotic
rates are shown in 2-cell linear and 7-cell planar models and are compared with finite-dimensional
simulation results obtained by the combination of DPC and the actual dynamic scheduling scheme based
on stochastic optimization. Concluding remarks are presented in Section VI.

II. Problem setup

We consider M BSs with $\gamma N$ antennas each, and KN single-antenna user terminals, distributed in
the cellular coverage region. Users are divided into K co-located "user groups" of equal size N. Users
in the same group are statistically equivalent: they experience the same pathloss from all BSs and their
small-scale fading channel coefficients are independent and identically distributed (i.i.d.). In practice, it
is reasonable to assume that co-located users are separated by a sufficient number of wavelengths such
that they undergo i.i.d. small-scale fading, but the wavelength is sufficiently small so that they all have
essentially the same distance-dependent pathloss. Users in different groups observe generally different
pathlosses, depending on their relative positions with respect to the BSs.

We assume a block-fading model where the channel coefficients are constant over time-frequency 
"slots" determined by the channel coherence time and bandwidth, and change according to some well- 
defined ergodic process from slot to slot. In contrast, the distance-dependent pathloss coefficients are 
constant in time. This is representative of a typical situation where the distance between BSs and users 
changes significantly over a time-scale of the order of tens of seconds, while the small-scale fading
decorrelates completely within a few milliseconds [35]. The slot index shall be denoted by t, but we will
omit t for notation simplicity whenever possible. We shall make explicit reference to the time slot when
discussing the dynamic fairness scheduling policy in Section IV.



One channel use of the multi-cell MU-MIMO downlink is described by

$$\mathbf{y}_k = \sum_{m=1}^{M} \alpha_{m,k}\, \mathbf{H}_{m,k}^{\mathsf{H}}\, \mathbf{x}_m + \mathbf{n}_k \tag{1}$$

where $\mathbf{y}_k = [y_{k,1} \cdots y_{k,N}]^{\mathsf{T}} \in \mathbb{C}^{N}$ denotes the received signal vector for the $k$-th user group, $\alpha_{m,k}$
and $\mathbf{H}_{m,k}$ denote the distance-dependent pathloss and a $\gamma N \times N$ channel matrix collecting the
small-scale channel fading coefficients from the $m$-th BS to the $k$-th user group, respectively,
$\mathbf{x}_m = [x_{m,1} \cdots x_{m,\gamma N}]^{\mathsf{T}} \in \mathbb{C}^{\gamma N}$ is the signal vector transmitted by the $m$-th BS, and
$\mathbf{n}_k = [n_{k,1} \cdots n_{k,N}]^{\mathsf{T}} \in \mathbb{C}^{N}$ denotes the AWGN at the user receivers in the $k$-th user group.
The elements of $\mathbf{n}_k$ and of $\mathbf{H}_{m,k}$ are i.i.d. $\sim \mathcal{CN}(0,1)$. We assume a per-BS average power
constraint expressed by $\mathrm{tr}(\mathrm{Cov}(\mathbf{x}_m)) \le P_m$, where $P_m \ge 0$ denotes the total transmit power of the $m$-th BS.

We assume that the BSs are grouped into cooperation clusters. Each cluster acts effectively as a 
distributed MU-MIMO system, with a distributed transmit antenna array formed by all antennas of all 
BS in the cluster. Each cluster has perfect channel state information for all the users associated with the 
cluster, and has statistical information (i.e., known distributions but not the instantaneous values) relative 
to signals from other clusters. Within these channel state information assumptions, we consider ideal joint 
processing of all BSs in the same cluster. Inter-Cluster Interference (ICI) is treated as additional Gaussian
noise. Let $L$ denote the number of cooperation clusters. We define the BS partition $\{\mathcal{M}_1, \ldots, \mathcal{M}_L\}$ of the
set $\{1, \ldots, M\}$ and the corresponding user group partition $\{\mathcal{K}_1, \ldots, \mathcal{K}_L\}$ of the set $\{1, \ldots, K\}$, where
$\mathcal{M}_\ell$ and $\mathcal{K}_\ell$ denote the sets of BSs and user groups forming the $\ell$-th cooperation cluster. We assume that
clusters are selfish and use all available transmit power, with no consideration for the ICI that they may
cause to other clusters. Hence, the ICI plus noise variance at any user terminal in group $k \in \mathcal{K}_\ell$ is given
by

$$\sigma_k^2 = \mathbb{E}\left[\frac{1}{N}\Big\| \mathbf{n}_k + \sum_{m \notin \mathcal{M}_\ell} \alpha_{m,k}\, \mathbf{H}_{m,k}^{\mathsf{H}}\, \mathbf{x}_m \Big\|^2\right] = 1 + \sum_{m \notin \mathcal{M}_\ell} \alpha_{m,k}^2 P_m. \tag{2}$$

From the viewpoint of cluster $\mathcal{M}_\ell$, the system is equivalent to a single-cell MU-MIMO downlink with
per-group-of-antennas power constraint where each antenna group corresponds to each BS's antennas,
and with AWGN power at the user receivers given by (2). Therefore, from now on, we shall focus on a
reference cluster (say, $\ell$) and simplify our notation. We let $B = |\mathcal{M}_\ell|$ and $A = |\mathcal{K}_\ell|$ denote the number
of BSs and user groups in the cluster, and enumerate the BSs and the user groups forming the cluster as
$m = 1, \ldots, B$ and $k = 1, \ldots, A$, respectively. Also, we define the modified path coefficients $\beta_{m,k} = \alpha_{m,k}/\sigma_k$
and the cluster channel matrix

$$\mathbf{H} = \begin{bmatrix} \beta_{1,1}\mathbf{H}_{1,1} & \cdots & \beta_{1,A}\mathbf{H}_{1,A} \\ \vdots & \ddots & \vdots \\ \beta_{B,1}\mathbf{H}_{B,1} & \cdots & \beta_{B,A}\mathbf{H}_{B,A} \end{bmatrix}. \tag{3}$$

Hence, one channel use of the reference cluster downlink is given by

$$\mathbf{y} = \mathbf{H}^{\mathsf{H}}\mathbf{x} + \mathbf{v} \tag{4}$$

where $\mathbf{y} \in \mathbb{C}^{AN}$, $\mathbf{x} \in \mathbb{C}^{\gamma BN}$, and $\mathbf{v} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ (we drop the subscript $\ell$ for notation simplicity).
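
The normalization in (2)-(3) can be illustrated numerically. The following is a minimal sketch, assuming the pathlosses are stored in an array `alpha[m, k]` and the powers in `power[m]`; the function name and argument layout are illustrative choices, not part of the original model description.

```python
import numpy as np

def modified_path_coefficients(alpha, power, cluster_bs, cluster_groups):
    """Return (sigma2, beta) for one cluster.

    sigma2[k] follows eq. (2): 1 + sum over BSs outside the cluster of alpha[m, k]^2 * P_m.
    beta[m, k] = alpha[m, k] / sigma_k for BSs and user groups inside the cluster.
    """
    alpha = np.asarray(alpha, dtype=float)   # shape (M, K)
    power = np.asarray(power, dtype=float)   # shape (M,)
    cluster_bs = np.asarray(cluster_bs, dtype=int)
    cluster_groups = np.asarray(cluster_groups, dtype=int)
    # Indices of the BSs that do not belong to the reference cluster
    outside = np.setdiff1d(np.arange(alpha.shape[0]), cluster_bs)
    ici = (alpha[np.ix_(outside, cluster_groups)] ** 2 * power[outside, None]).sum(axis=0)
    sigma2 = 1.0 + ici
    beta = alpha[np.ix_(cluster_bs, cluster_groups)] / np.sqrt(sigma2)
    return sigma2, beta

# Hypothetical two-BS example: BS 0 forms the cluster, BS 1 is an interferer.
sigma2, beta = modified_path_coefficients(
    alpha=[[1.0, 0.5], [0.5, 1.0]], power=[2.0, 4.0],
    cluster_bs=[0], cluster_groups=[0])
```

With no cooperation (each BS is its own cluster) every other BS contributes to `sigma2`; with full cooperation `outside` is empty and `beta` equals `alpha`.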

It is well-known that the boundary of the capacity region of the MIMO BC (4) for fixed channel matrix
$\mathbf{H}$ and given per-group-of-antennas power constraints $\{P_1, \ldots, P_B\}$ can be characterized by the solution
of a min-max weighted sum-rate problem [36]-[38]. For reasons that will be clear when discussing the
scheduling policy in Section IV, we restrict ourselves to the case of identical weights for all statistically
equivalent users, i.e., to the case where users in the same group have the same weight for their individual
rates. We let $W_k$ and $R_k(\mathbf{H}) = \frac{1}{N}\sum_{i=1}^{N} R_{k,i}(\mathbf{H})$ denote the weight for user group $k$ and the corresponding
instantaneous per-user rate, respectively. In this paper, we refer to as "instantaneous" the quantities that
depend on the realization of the channel matrix $\mathbf{H}$. Since this changes from slot to slot, instantaneous
quantities also change accordingly. We let $\pi$ denote the permutation that sorts the weights in increasing
order $W_{\pi_1} \le \cdots \le W_{\pi_A}$ and use the subscript $k{:}A$ to indicate quantities involving user groups from
$\pi_k$ to $\pi_A$. In particular, we let $\mathbf{H}_{k:A} = [\mathbf{H}_{\pi_k} \cdots \mathbf{H}_{\pi_A}]$ and $\mathbf{Q}_{k:A} = \mathrm{diag}(\mathbf{Q}_{\pi_k}, \ldots, \mathbf{Q}_{\pi_A})$, where $\mathbf{H}_k$ is
the $k$-th $\gamma BN \times N$ slice of $\mathbf{H}$ in (3) and where $\mathbf{Q}_k = \mathrm{diag}(q_{k,1}, \ldots, q_{k,N})$ is an $N \times N$ non-negative
definite diagonal matrix.

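The rate point $\{R_1(\mathbf{H}), \ldots, R_A(\mathbf{H})\}$ corresponding to weights $\{W_1, \ldots, W_A\}$ is obtained as the solution
of the min-max problem [36]-[38]

$$\min_{\boldsymbol{\lambda} \ge 0}\ \max_{\mathbf{Q} \ge 0}\ \sum_{k=1}^{A} W_{\pi_k} R_{\pi_k}(\mathbf{H}) \tag{5}$$

for the instantaneous per-user rate of each group

$$R_{\pi_k}(\mathbf{H}) = \frac{1}{N} \log \frac{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) + \mathbf{H}_{k:A} \mathbf{Q}_{k:A} \mathbf{H}_{k:A}^{\mathsf{H}} \right|}{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) + \mathbf{H}_{k+1:A} \mathbf{Q}_{k+1:A} \mathbf{H}_{k+1:A}^{\mathsf{H}} \right|} \tag{6}$$

where $\boldsymbol{\Sigma}(\boldsymbol{\lambda})$ is a $\gamma BN \times \gamma BN$ block-diagonal matrix with $\gamma N \times \gamma N$ constant diagonal blocks $\lambda_m \mathbf{I}_{\gamma N}$,
for $m = 1, \ldots, B$, and the maximization with respect to $\mathbf{Q}$ is subject to the trace constraint

$$\mathrm{tr}(\mathbf{Q}) \le \sum_{m=1}^{B} \lambda_m P_m. \tag{7}$$

The variables $\boldsymbol{\lambda} = \{\lambda_m\}$ are the Lagrange multipliers corresponding to the per-group-of-antennas power
constraints. The rate $R_{\pi_k}(\mathbf{H})$ in (6) can be interpreted as the instantaneous per-user rate of user group
$\pi_k$ in the dual vector Multiple Access Channel (MAC) with worst-case noise defined by

$$\mathbf{r} = \sum_{k=1}^{A} \mathbf{H}_k \mathbf{s}_k + \mathbf{z} \tag{8}$$

where $\mathrm{Cov}(\mathbf{s}_k) = \mathbf{Q}_k$ and $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma}(\boldsymbol{\lambda}))$. In this "dual MAC" interpretation, the group sum-rate
expression (6) corresponds to group-wise successive interference cancellation, where user groups are
decoded successively in the order of $\pi_1, \pi_2, \ldots, \pi_A$, and users in each group are jointly decoded. Also,
notice that users in group $\pi_k$ in general do not achieve individually the rate $R_{\pi_k}(\mathbf{H})$ on every slot. Rather,
this rate is the aggregate sum-rate of all users in group $\pi_k$, normalized by $N$, i.e., the mean user rate of
group $\pi_k$ for given $\mathbf{H}$.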

Efficient interior-point methods to solve (5) are given, for example, in [36]-[38]. Yet, the solution of
this problem is numerically fairly involved, especially for large dimensions.

Consistently with the assumption of fixed coefficients $\{\beta_{m,k}\}$ and ergodic block-fading for the small-
scale fading coefficients $\{\mathbf{H}_{m,k}\}$, the ergodic capacity region of the MU-MIMO downlink channel (4) is
given by the set of all achievable average rates, where averaging is with respect to the small-scale fading
coefficients. In particular, let $R_k(\mathbf{H}, W_1, \ldots, W_A)$ denote the $k$-th user group rate at the solution of (5).
Then, an inner bound to the ergodic capacity region is given by

$$\mathcal{C}(P_1, \ldots, P_B) = \mathrm{coh} \bigcup_{W_1, \ldots, W_A \ge 0} \Big\{ \mathbf{R} : 0 \le R_{k,i} \le \mathbb{E}\left[ R_k(\mathbf{H}, W_1, \ldots, W_A) \right],\ \forall k = 1, \ldots, A,\ \forall i = 1, \ldots, N \Big\} \tag{9}$$

where "coh" indicates the closure of the convex hull. The achievability of the above region is clear:
all users $i$ in group $k$ are statistically equivalent and therefore they can achieve the same ergodic rate.
Notice that $\mathcal{C}(P_1, \ldots, P_B)$ is generally an inner bound because of the restriction of the weights in (5) to
be identical for all users in the same group. We will see later that, for fairness scheduling in the limit of
$N \to \infty$, this limitation becomes immaterial.

At this point we can formulate the fairness scheduling problem. Let $g(\mathbf{R})$ denote a strictly increasing
and concave network utility of the ergodic user rates. While the channel fading coefficients change
from time slot to time slot according to some ergodic process, the optimal scheduling policy allocates
dynamically the transmit powers and the DPC precoding order in order to let the system operate at the
ergodic rate point solution of:

$$\begin{array}{ll} \text{maximize} & g(\mathbf{R}) \\ \text{subject to} & \mathbf{R} \in \mathcal{C}(P_1, \ldots, P_B) \end{array} \tag{10}$$

Different fairness criteria can be enforced by choosing appropriately the function $g(\cdot)$ [18]. For example,
proportional fairness [15], [39], [40] is obtained by letting $g(\mathbf{R}) = \sum_{k,i} \log R_{k,i}$, and max-min fairness
is obtained by letting $g(\mathbf{R}) = \min_{k,i} R_{k,i}$.
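
To make the role of the utility concrete, the two examples above can be written as short functions. This is only an illustrative sketch: the function names and the two hypothetical rate vectors (same sum-rate, different spread) are assumptions introduced here.

```python
import numpy as np

def proportional_fairness(rates):
    """g(R) = sum of log R_{k,i} over all users (requires strictly positive rates)."""
    return float(np.sum(np.log(rates)))

def max_min_fairness(rates):
    """g(R) = smallest user rate."""
    return float(np.min(rates))

# Two hypothetical rate vectors with the same sum-rate of 2.0
balanced = np.array([1.0, 1.0])
skewed = np.array([1.9, 0.1])   # cell-center user favored, cell-edge user starved

pf_balanced, pf_skewed = proportional_fairness(balanced), proportional_fairness(skewed)
mm_balanced, mm_skewed = max_min_fairness(balanced), max_min_fairness(skewed)
```

Both utilities rank the balanced point above the skewed one even though the sum-rates coincide, which is the sense in which the choice of $g(\cdot)$ enforces fairness.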

We notice here that an analytical characterization of the ergodic rate point $\mathbf{R}^*$ achieving the optimum
in (10) is in general extremely complicated. However, by applying the general stochastic optimization
framework of [16] (see also [17], more specifically targeted to the MU-MIMO downlink), explicit
scheduling policies can be designed such that the limit of the time-averaged user rates converges to $\mathbf{R}^*$.
The rest of this paper is dedicated to finding an efficient method to directly compute $\mathbf{R}^*$, by exploiting
large random matrix theory and convex optimization.

III. Weighted average sum rate maximization

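In this section, we consider the solution of the following preliminary problem. For fixed $\{\lambda_m\}$, we
consider the weighted average sum-rate maximization

$$\begin{array}{ll} \text{maximize} & \displaystyle \sum_{k=1}^{A} W_{\pi_k} \frac{1}{N} \mathbb{E}\left[ \log \frac{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) + \mathbf{H}_{k:A} \mathbf{Q}_{k:A} \mathbf{H}_{k:A}^{\mathsf{H}} \right|}{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) + \mathbf{H}_{k+1:A} \mathbf{Q}_{k+1:A} \mathbf{H}_{k+1:A}^{\mathsf{H}} \right|} \right] \\ \text{subject to} & \mathrm{tr}(\mathbf{Q}) \le \bar{Q} \end{array} \tag{11}$$

where we define $\bar{Q} = \sum_{m=1}^{B} \lambda_m P_m$. Letting $\Delta_k = W_{\pi_k} - W_{\pi_{k-1}}$, with $W_{\pi_0} = 0$, the objective function
in (11) can be written as

$$F_{\mathbf{W},\boldsymbol{\lambda}}(\mathbf{Q}) = \sum_{k=1}^{A} \Delta_k \frac{1}{N} \mathbb{E}\left[ \log \frac{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) + \mathbf{H}_{k:A} \mathbf{Q}_{k:A} \mathbf{H}_{k:A}^{\mathsf{H}} \right|}{\left| \boldsymbol{\Sigma}(\boldsymbol{\lambda}) \right|} \right]. \tag{12}$$

In this form, problem (11) is clearly convex, since $F_{\mathbf{W},\boldsymbol{\lambda}}(\mathbf{Q})$ in (12) is a concave function of $\mathbf{Q}$. In
addition, we can prove the following symmetry result: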



Lemma 1: The optimal $\mathbf{Q}$ in (11) allocates equal power to the users in the same group.

Proof: Denote the utility function in (12) as a function of the diagonal entries of $\mathbf{Q}$ as
$f(\{q_{k,i} : 1 \le k \le A, 1 \le i \le N\})$. Since users in the same group are statistically equivalent and the function $f(\cdot)$ is
defined through an expectation with respect to the channel fading coefficients, it follows that $f(\cdot)$ must
be invariant with respect to permutations of the arguments $q_{k,1}, \ldots, q_{k,N}$. That is, for any $k = 1, \ldots, A$,
and $1 \le i < j \le N$, the value of the function is invariant if the arguments $q_{k,i}$ and $q_{k,j}$ are exchanged.

Suppose that $\{q^*_{k,i}\}$ is the optimal input power allocation, solution of (11). Then, we have

$$f(\ldots, q^*_{k,i}, \ldots, q^*_{k,j}, \ldots) = f(\ldots, q^*_{k,j}, \ldots, q^*_{k,i}, \ldots) \le f\left(\ldots, \frac{q^*_{k,i} + q^*_{k,j}}{2}, \ldots, \frac{q^*_{k,i} + q^*_{k,j}}{2}, \ldots\right)$$

where the inequality follows from the concavity of $f(\cdot)$ and Jensen's inequality. Under the optimality
assumption, equality must hold and this implies that $(q^*_{k,i} + q^*_{k,j})/2$ is the optimal input power for both users
$i$ and $j$ in group $k$. Extending this argument by induction, it follows that the optimal input power must
be in the form $q^*_{k,i} = Q_k/N$ for all $i = 1, \ldots, N$, for some values $Q_1, \ldots, Q_A$. ■
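
The passage from the weighted objective in (11) to the form (12) rests on a summation by parts over the sorted weights. The following sketch checks that identity numerically; the random numbers `a[k]` merely stand in for the log-determinant terms, and all names are illustrative assumptions.

```python
import numpy as np

def weighted_sum_two_ways(num_groups=5, seed=0):
    """Compare sum_k W_k (a_k - a_{k+1}) with sum_k Delta_k (a_k - a_{A+1})."""
    rng = np.random.default_rng(seed)
    # Weights sorted in increasing order, as W_{pi_1} <= ... <= W_{pi_A}
    W = np.sort(rng.uniform(0.1, 1.0, size=num_groups))
    # a[k] stands in for log|Sigma + H_{k+1:A} Q H^H| (0-based k); a[-1] for log|Sigma|
    a = np.sort(rng.uniform(0.0, 3.0, size=num_groups + 1))[::-1]
    lhs = np.sum(W * (a[:-1] - a[1:]))              # form used in (11)
    delta = np.diff(np.concatenate(([0.0], W)))     # Delta_k = W_k - W_{k-1}, W_0 = 0
    rhs = np.sum(delta * (a[:-1] - a[-1]))          # form used in (12)
    return lhs, rhs

lhs, rhs = weighted_sum_two_ways()
```

The two values agree up to floating-point rounding, for any ordering-consistent choice of the stand-in terms.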

Using Lemma 1, we restrict the optimization in (12) to block-diagonal matrices $\mathbf{Q}$ with constant
diagonal blocks $\mathbf{Q}_k = \frac{Q_k}{N}\mathbf{I}$. The following lemma shows that we can restrict to strictly positive $\{\lambda_m\}$:

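Lemma 2: The optimal $\boldsymbol{\lambda}^*$ for the min-max problem (5) are strictly positive, i.e., $\boldsymbol{\lambda}^* > 0$.

Proof: The dual variable $\lambda_m$ plays the role of the noise power at the antennas of the $m$-th BS in
the dual MAC. Let $G_{\mathbf{W}}(\boldsymbol{\lambda}) = \max_{\mathbf{Q} : \mathrm{tr}(\mathbf{Q}) \le \bar{Q}} F_{\mathbf{W},\boldsymbol{\lambda}}(\mathbf{Q})$ and suppose that $\lambda^*_m = 0$ for some $m$. Then,
$|\boldsymbol{\Sigma}(\boldsymbol{\lambda}^*)| = 0$ in (12) and $G_{\mathbf{W}}(\boldsymbol{\lambda}^*)$ goes to positive infinity, which clearly cannot be the solution of the
minimization with respect to $\boldsymbol{\lambda}$ in (5). Therefore, the optimal $\lambda^*_m$ is strictly positive for all $m = 1, \ldots, B$. ■

Then we can define $\bar{\mathbf{H}}_k = \frac{1}{\sqrt{N}} \boldsymbol{\Sigma}^{-1/2}(\boldsymbol{\lambda}) \mathbf{H}_k$ and rewrite the objective function with some abuse of
notation as

$$F_{\mathbf{W},\boldsymbol{\lambda}}(Q_1, \ldots, Q_A) = \sum_{k=1}^{A} \Delta_k \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \right| \right] \tag{13}$$

where the trace constraint in (11) becomes $\sum_{k=1}^{A} Q_k \le \bar{Q}$.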



A. Solution for finite N 



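The Lagrangian function of problem (11) is given by

$$\mathcal{L}(Q_1, \ldots, Q_A; \xi) = F_{\mathbf{W},\boldsymbol{\lambda}}(Q_1, \ldots, Q_A) - \xi \left( \sum_{k=1}^{A} Q_k - \bar{Q} \right). \tag{14}$$

Using the differentiation rule $\partial \log |\mathbf{X}| = \mathrm{tr}(\mathbf{X}^{-1} \partial \mathbf{X})$, we write the KKT conditions as

$$\frac{\partial \mathcal{L}}{\partial Q_{\pi_j}} = \sum_{k=1}^{j} \Delta_k \frac{1}{N} \mathbb{E}\left[ \mathrm{tr}\left( \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \Big[ \mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \Big]^{-1} \bar{\mathbf{H}}_{\pi_j} \right) \right] - \xi \le 0 \tag{15}$$

for $j = 1, \ldots, A$, where equality must hold at the optimal point for all $j$ such that $Q_{\pi_j} > 0$. After some
algebra and the application of the Sherman-Morrison-Woodbury matrix inversion lemma [41], the trace
in (15) can be rewritten in a more convenient form

$$\frac{1}{N} \mathrm{tr}\left( \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \Big[ \mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \Big]^{-1} \bar{\mathbf{H}}_{\pi_j} \right) = \frac{1}{N} \mathrm{tr}\left( \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \Big[ \boldsymbol{\Theta}_{k:A/j} + Q_{\pi_j} \bar{\mathbf{H}}_{\pi_j} \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \Big]^{-1} \bar{\mathbf{H}}_{\pi_j} \right) = \frac{1 - \mathrm{mmse}^{(j)}_{k:A}}{Q_{\pi_j}} \tag{16}$$

where we let $\boldsymbol{\Theta}_{k:A/j} = \mathbf{I} + \sum_{\ell=k, \ell \ne j}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell}$ and where we define $\mathrm{mmse}^{(j)}_{k:A}$ as follows: consider
the observation model

$$\mathbf{r}_{[k:A]} = \sum_{\ell=k}^{A} Q_{\pi_\ell}^{1/2} \bar{\mathbf{H}}_{\pi_\ell} \mathbf{s}_\ell + \mathbf{z} \tag{17}$$

where $\mathbf{s}_k, \mathbf{s}_{k+1}, \ldots, \mathbf{s}_A$ and $\mathbf{z}$ are Gaussian independent vectors with i.i.d. components $\sim \mathcal{CN}(0,1)$.
Then, $\mathrm{mmse}^{(j)}_{k:A}$ denotes the per-component MMSE for the estimation of $\mathbf{s}_j$ from $\mathbf{r}_{[k:A]}$, for fixed (known)
matrices $\bar{\mathbf{H}}_{\pi_k}, \ldots, \bar{\mathbf{H}}_{\pi_A}$. Explicitly, we have

$$\mathrm{mmse}^{(j)}_{k:A} = \frac{1}{N} \mathrm{tr}\left( \mathbf{I} - Q_{\pi_j} \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \Big[ \bar{\mathbf{H}}_{\pi_j} \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} Q_{\pi_j} + \boldsymbol{\Theta}_{k:A/j} \Big]^{-1} \bar{\mathbf{H}}_{\pi_j} \right) = \frac{1}{N} \mathrm{tr}\left( \Big[ \mathbf{I} + Q_{\pi_j} \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \boldsymbol{\Theta}_{k:A/j}^{-1} \bar{\mathbf{H}}_{\pi_j} \Big]^{-1} \right). \tag{18}$$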
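
The matrix identities behind (16) and (18) can be verified numerically on a random instance. The sketch below is illustrative only: `H` plays the role of $\bar{\mathbf{H}}_{\pi_j}$, `Theta` is an arbitrary Hermitian positive definite matrix standing in for $\boldsymbol{\Theta}_{k:A/j}$, and the dimensions, scalings, and function name are assumptions made for the example.

```python
import numpy as np

def mmse_identity_check(n_rows=6, n_cols=4, q=1.7, seed=0):
    """Return (mmse via second form of (18), mmse via first form, lhs of (16), rhs of (16))."""
    rng = np.random.default_rng(seed)
    # Random complex Gaussian matrix standing in for the normalized channel slice
    H = (rng.standard_normal((n_rows, n_cols))
         + 1j * rng.standard_normal((n_rows, n_cols))) / np.sqrt(2 * n_cols)
    G = (rng.standard_normal((n_rows, n_rows))
         + 1j * rng.standard_normal((n_rows, n_rows))) / np.sqrt(2 * n_rows)
    Theta = np.eye(n_rows) + G @ G.conj().T          # Hermitian positive definite
    Hh = H.conj().T
    M = Hh @ np.linalg.solve(Theta, H)               # H^H Theta^{-1} H
    mmse = np.trace(np.linalg.inv(np.eye(n_cols) + q * M)).real / n_cols
    mmse_alt = np.trace(
        np.eye(n_cols) - q * Hh @ np.linalg.solve(q * H @ Hh + Theta, H)).real / n_cols
    lhs = np.trace(Hh @ np.linalg.solve(Theta + q * H @ Hh, H)).real / n_cols
    rhs = (1.0 - mmse) / q
    return mmse, mmse_alt, lhs, rhs

mmse, mmse_alt, lhs, rhs = mmse_identity_check()
```

The two MMSE expressions coincide, and the normalized trace equals $(1 - \mathrm{mmse})/Q$, for any positive power `q`.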
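Using (16) in (15) and solving for the Lagrange multiplier, we find

$$\xi = \frac{1}{\bar{Q}} \sum_{\ell=1}^{A} \sum_{k=1}^{\ell} \Delta_k \left( 1 - \mathbb{E}\big[\mathrm{mmse}^{(\ell)}_{k:A}\big] \right). \tag{19}$$

Finally, we arrive at the conditions

$$Q_{\pi_j} = \bar{Q}\, \frac{\sum_{k=1}^{j} \Delta_k \left( 1 - \mathbb{E}\big[\mathrm{mmse}^{(j)}_{k:A}\big] \right)}{\sum_{\ell=1}^{A} \sum_{k=1}^{\ell} \Delta_k \left( 1 - \mathbb{E}\big[\mathrm{mmse}^{(\ell)}_{k:A}\big] \right)} \tag{20}$$

for all $j$ such that $Q_{\pi_j} > 0$, where, using the KKT conditions and (19), we find that for all $j$ such that
$Q_{\pi_j} = 0$, the inequality

$$\bar{Q} \sum_{k=1}^{j} \Delta_k \frac{1}{N} \mathbb{E}\left[ \mathrm{tr}\left( \bar{\mathbf{H}}_{\pi_j}^{\mathsf{H}} \boldsymbol{\Theta}_{k:A/j}^{-1} \bar{\mathbf{H}}_{\pi_j} \right) \right] \le \sum_{\ell=1}^{A} \sum_{k=1}^{\ell} \Delta_k \left( 1 - \mathbb{E}\big[\mathrm{mmse}^{(\ell)}_{k:A}\big] \right) \tag{21}$$

must hold. Eventually, we have proved the following result:

Theorem 1: The solution $Q^*_1, \ldots, Q^*_A$ of problem (11) is given as follows. For all $j$ for which (21) is
satisfied, $Q^*_j = 0$. Otherwise, the positive $Q^*_j$ satisfy (20). ■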

In finite dimension, an iterative algorithm that provably converges to the solution can be obtained as a
simple modification of [33, Algorithm 1]. The computational cost is very high because the average
MMSE terms must be computed by Monte Carlo simulation. Since our emphasis is on the solution in
the limit for $N \to \infty$, we omit these details and focus on the infinite-dimensional case in Section III-B.

In addition, we have not yet addressed the outer minimization with respect to the Lagrange multipliers
$\{\lambda_m\}$. We postpone this issue to Section III-D, where we discuss system symmetry conditions for which
the solution under the per-BS power constraint coincides with that under the laxer per-cluster sum power constraint.
In this case, we can let $\lambda_m = 1$ for all $m$, and no minimization with respect to $\boldsymbol{\lambda}$ is needed.

B. Limit for $N \to \infty$

In this section, we consider problem (11) in the limit for $N \to \infty$, making use of the asymptotic
random matrix results of [32]. In this regime, the instantaneous per-user rates in (5) converge to their
expected values by well-known convergence results of the empirical distribution of the log-determinants
in the rate expression (6) [30], [31]. Hence, in the large-system regime, the solution of (11) coincides
with that of (5) for fixed channel pathloss coefficients $\{\beta_{m,k}\}$, transmit power constraints $\{P_m\}$, weights
$\{W_k\}$ and Lagrange multipliers $\{\lambda_m\}$. We will use this fact in Section IV, where we will examine
a general dynamic fairness scheduling policy for the actual (finite-dimensional) system, and study its
performance in the large-system regime.
We introduce the normalized row and column indices $r$ and $t$, taking values in $[0, 1)$, and the aspect
ratio of the matrix $\bar{\mathbf{H}}$ given by the ratio of the number of columns over the number of rows and given
by $\nu = \frac{A}{\gamma B}$. Then, we define the following piecewise constant functions:

• $\mathcal{Q}(t)$: (dual uplink) transmit power profile such that $\mathcal{Q}(t) = Q_{\pi_k}$ for $\frac{k-1}{A} \le t < \frac{k}{A}$.

• $\mathcal{G}(r,t)$: channel gain profile of the matrix $\bar{\mathbf{H}}$ such that $\mathcal{G}(r,t) = \beta^2_{m,\pi_k}/\lambda_m$ for $\frac{m-1}{B} \le r < \frac{m}{B}$ and
$\frac{k-1}{A} \le t < \frac{k}{A}$.

• $\Upsilon_{k:A}(t)$: average per-component MMSE profile of the observation model (17), such that
$\Upsilon_{k:A}(t) = \mathrm{mmse}^{(j)}_{k:A}$ for $\frac{j-1}{A} \le t < \frac{j}{A}$, with $j = k, \ldots, A$.

• $\Gamma_{k:A}(t)$: signal-to-interference-plus-noise ratio (SINR) profile corresponding to $\Upsilon_{k:A}(t)$ such that
$\Gamma_{k:A}(t) = 1/\Upsilon_{k:A}(t) - 1$.

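In the limit of large $N$, these functions satisfy equations given by the following lemma:

Lemma 3: As $N \to \infty$, for each $k = 1, \ldots, A$, the SINR functions $\Gamma_{k:A}(t)$ satisfy the fixed-point
equation

$$\Gamma_{k:A}(t) = \int_0^1 \frac{\gamma B\, \mathcal{G}(r,t)\, \mathcal{Q}(t)}{1 + \nu \int_{(k-1)/A}^{1} \frac{\gamma B\, \mathcal{G}(r,\tau)\, \mathcal{Q}(\tau)}{1 + \Gamma_{k:A}(\tau)}\, d\tau}\, dr. \tag{22}$$

Also, the asymptotic $\Upsilon_{k:A}(t)$ is given in terms of the asymptotic $\Gamma_{k:A}(t)$ as $\Upsilon_{k:A}(t) = 1/(1 + \Gamma_{k:A}(t))$.

Proof: We apply [32, Lemma 1] to the matrix $\mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell}$, where $\bar{\mathbf{H}}_{k:A} = [\bar{\mathbf{H}}_{\pi_k}, \ldots, \bar{\mathbf{H}}_{\pi_A}]$
has independent non-identically distributed components. The variance profile in [32, Lemma 1] is defined
as the limit of the variance of the elements of the matrix $\bar{\mathbf{H}}_{k:A}$, multiplied by the number of rows, $\gamma BN$.
With our normalization, the elements of each $(m, \ell)$-th block of $\bar{\mathbf{H}}_{k:A}$ of size $\gamma N \times N$ have variance
$\beta^2_{m,\pi_\ell}/(\lambda_m N)$. Therefore, the variance profile for the application of [32, Lemma 1] is given by $\gamma B\, \mathcal{G}(r,t)$,
where $\mathcal{G}(r,t)$ is the piecewise constant function defined above. Eventually, we arrive at (22). ■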
Since all functions involved in Lemma 3 are piecewise constant (although the lemma applies in more
generality), we can give a more explicit expression directly in terms of the discrete set of values of these
functions. Replacing $\Gamma_{k:A}(t) = \Gamma^{(j)}_{k:A}$ for all $\frac{j-1}{A} \le t < \frac{j}{A}$ with $j \ge k$ in (22) and solving for the integrals
of piecewise constant functions, we obtain

$$\Gamma^{(j)}_{k:A} = \gamma \sum_{m=1}^{B} \frac{(\beta^2_{m,\pi_j}/\lambda_m)\, Q_{\pi_j}}{1 + \sum_{\ell=k}^{A} \frac{(\beta^2_{m,\pi_\ell}/\lambda_m)\, Q_{\pi_\ell}}{1 + \Gamma^{(\ell)}_{k:A}}}. \tag{23}$$

Combining (23) with the already mentioned modification of the iterative algorithm of [33, Algorithm 1],
we obtain a procedure to compute the maximum weighted average sum rate of problem (11), for fixed
weights $\{W_k\}$ and Lagrange multipliers $\{\lambda_m\}$. This is summarized by Algorithm 1 below (for notation
simplicity, the algorithm is written assuming $\pi_k = k$ for all $k = 1, \ldots, A$. We can always reduce to this
case after a simple reordering of the weights).

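Algorithm 1 Input power optimization for weighted average sum rate maximization

1) Initialize $Q_k(0) = \bar{Q}/A$ for all $k = 1, \ldots, A$.

2) For $i = 0, 1, 2, \ldots$, iterate until the following solution settles:

$$Q_j(i+1) = \bar{Q}\, \frac{\sum_{k=1}^{j} \Delta_k \left( 1 - \Upsilon^{(j)}_{k:A}(i) \right)}{\sum_{\ell=1}^{A} \sum_{k=1}^{\ell} \Delta_k \left( 1 - \Upsilon^{(\ell)}_{k:A}(i) \right)} \tag{24}$$

for $j = 1, \ldots, A$, where $\Upsilon^{(j)}_{k:A}(i) = 1/(1 + \Gamma^{(j)}_{k:A}(i))$, and $\Gamma^{(j)}_{k:A}(i)$ is obtained as the solution of the
system of fixed point equations (23), also obtained by iteration, for powers $Q_k = Q_k(i)$, $\forall k$.

3) Denote by $\Gamma^{(j)}_{k:A}(\infty)$, $\Upsilon^{(j)}_{k:A}(\infty)$ and by $Q_j(\infty)$ the fixed points reached by the iteration at step 2).
If the condition

$$\bar{Q} \sum_{k=1}^{j} \Delta_k \Gamma^{(j)}_{k:A}(\infty) \le \sum_{\ell=1}^{A} \sum_{k=1}^{\ell} \Delta_k \left( 1 - \Upsilon^{(\ell)}_{k:A}(\infty) \right) \tag{25}$$

is satisfied for all $j$ such that $Q_j(\infty) = 0$, then stop. Otherwise, go back to the initialization step,
set $Q_j(0) = 0$ for $j$ corresponding to the lowest value of $\sum_{k=1}^{j} \Delta_k \Gamma^{(j)}_{k:A}(\infty)$, and repeat steps 2)
and 3) starting from the new initial condition.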
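
Steps 1) and 2) of Algorithm 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are taken as already sorted (so $\pi_k = k$), the array `G[m, j]` is assumed to hold $\beta^2_{m,j}/\lambda_m$, the outer loop runs for a fixed number of iterations instead of testing convergence, and the check (25) with the restart of step 3) is omitted. All function and variable names are hypothetical.

```python
import numpy as np

def sinr_fixed_point(k, Q, G, gamma, tol=1e-12, max_iter=10000):
    """Iterate the discrete fixed-point equation (23) for groups k..A-1 (0-based k).

    Returns an array whose entries are Gamma^{(j)}_{k:A} for j = k, ..., A-1.
    """
    idx = np.arange(k, len(Q))
    Gam = np.zeros(len(idx))
    for _ in range(max_iter):
        # Denominator of (23): one value per BS m
        denom = 1.0 + G[:, idx] @ (Q[idx] / (1.0 + Gam))
        new = gamma * Q[idx] * (G[:, idx] / denom[:, None]).sum(axis=0)
        if np.max(np.abs(new - Gam)) < tol:
            return new
        Gam = new
    return Gam

def power_iteration(W, G, gamma, Q_bar, n_iter=200):
    """Steps 1)-2) of Algorithm 1: iterate the power update (24)."""
    W = np.asarray(W, dtype=float)
    G = np.asarray(G, dtype=float)
    A = len(W)
    delta = np.diff(np.concatenate(([0.0], W)))   # Delta_k = W_k - W_{k-1}
    Q = np.full(A, Q_bar / A)                     # step 1): uniform initialization
    for _ in range(n_iter):
        num = np.zeros(A)
        for k in range(A):
            Gam = sinr_fixed_point(k, Q, G, gamma)
            # 1 - Upsilon = Gamma / (1 + Gamma); group j >= k collects the term Delta_k
            num[k:] += delta[k] * Gam / (1.0 + Gam)
        Q = Q_bar * num / num.sum()               # update (24)
    return Q

# Hypothetical 2-BS, 2-group cluster with asymmetric gains
Q_opt = power_iteration(W=[0.5, 1.0], G=[[1.0, 0.2], [0.2, 1.0]], gamma=2.0, Q_bar=4.0)
```

By construction every iterate satisfies the trace constraint with equality, since the update (24) rescales the numerators to sum to $\bar{Q}$.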



C. Computation of the asymptotic rates 

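After the powers $Q_k = Q_k(\infty)$ have been obtained from Algorithm 1, it remains to compute the
corresponding average per-user rates. The average rate of users in group $\pi_k$ is given by

$$\mathbb{E}\left[ R_{\pi_k}(\mathbf{H}) \right] = \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \right| \right] - \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \sum_{\ell=k+1}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \right| \right]. \tag{26}$$

In the limit for $N \to \infty$, we can use the asymptotic analytical expression for the mutual information
given in [34]. Adapting [34, Result 1] to our case, we obtain

$$\lim_{N \to \infty} \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \sum_{\ell=k}^{A} \bar{\mathbf{H}}_{\pi_\ell} \bar{\mathbf{H}}_{\pi_\ell}^{\mathsf{H}} Q_{\pi_\ell} \right| \right] = \sum_{\ell=k}^{A} \log\left( 1 + \gamma Q_{\pi_\ell} \sum_{m=1}^{B} (\beta^2_{m,\pi_\ell}/\lambda_m)\, u_m \right) + \gamma \sum_{m=1}^{B} \log\left( 1 + \sum_{\ell=k}^{A} Q_{\pi_\ell} (\beta^2_{m,\pi_\ell}/\lambda_m)\, v_\ell \right) - \gamma \sum_{\ell=k}^{A} \sum_{m=1}^{B} Q_{\pi_\ell} (\beta^2_{m,\pi_\ell}/\lambda_m)\, u_m v_\ell \tag{27}$$

where for each $k = 1, \ldots, A$, $\{u_m : m = 1, \ldots, B\}$ and $\{v_\ell : \ell = k, \ldots, A\}$ are the unique solutions to
the system of fixed point equations

$$u_m = \left( 1 + \sum_{\ell=k}^{A} Q_{\pi_\ell} (\beta^2_{m,\pi_\ell}/\lambda_m)\, v_\ell \right)^{-1},\ m = 1, \ldots, B, \qquad v_\ell = \left( 1 + \gamma \sum_{m=1}^{B} Q_{\pi_\ell} (\beta^2_{m,\pi_\ell}/\lambda_m)\, u_m \right)^{-1},\ \ell = k, \ldots, A. \tag{28}$$

The proof follows from [34], based on Girko's theorem [31] (see also [30]). Although (27) is not in
closed form, $\{u_m\}$ and $\{v_\ell\}$ in (28) can be solved by fixed point iterations with $A + B$ variables. These
converge very quickly to the solution to any desired degree of numerical accuracy.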
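
The fixed-point system (28) and the limit (27) translate directly into code. The sketch below is a hedged illustration: it assumes $\pi_k = k$, stores $\beta^2_{m,\ell}/\lambda_m$ in a hypothetical array `G[m, l]`, uses natural logarithms (rates in nats), and stops on a simple successive-difference tolerance. Function names are assumptions.

```python
import numpy as np

def logdet_limit(k, Q, G, gamma, tol=1e-13, max_iter=100000):
    """Right-hand side of (27) for groups k..A-1 (0-based k), with (u, v) solving (28)."""
    Q = np.asarray(Q, dtype=float)
    G = np.asarray(G, dtype=float)
    if k >= len(Q):
        return 0.0                      # empty sum: log|I| = 0
    Qk, Gk = Q[k:], G[:, k:]
    u = np.ones(G.shape[0])
    v = np.ones(len(Qk))
    for _ in range(max_iter):
        u_new = 1.0 / (1.0 + Gk @ (Qk * v))
        v_new = 1.0 / (1.0 + gamma * Qk * (u_new @ Gk))
        done = max(np.max(np.abs(u_new - u)), np.max(np.abs(v_new - v))) < tol
        u, v = u_new, v_new
        if done:
            break
    return float(np.sum(np.log(1.0 + gamma * Qk * (u @ Gk)))
                 + gamma * np.sum(np.log(1.0 + Gk @ (Qk * v)))
                 - gamma * np.sum(Qk * v * (u @ Gk)))

def group_rate(k, Q, G, gamma):
    """Asymptotic per-user rate of group k as the difference of two limits, as in (26)."""
    return logdet_limit(k, Q, G, gamma) - logdet_limit(k + 1, Q, G, gamma)

# Sanity case: one BS, one group, gamma = 1, unit gain and unit power
rate = group_rate(0, [1.0], [[1.0]], 1.0)
```

In the sanity case the system reduces to a single square i.i.d. matrix, for which (28) gives $u = v$ with $Q u^2 + u - 1 = 0$, and (27) becomes $2\log(1/u) - Q u^2$; the test below compares the iteration against this closed form.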



D. System symmetry 



So far we have considered the solution of the maximization in (11) for fixed $\{\lambda_m\}$. However, we are
interested in the solution of (5) including the per-BS power constraint, which requires minimization with
respect to $\{\lambda_m\}$. In finite dimension and for fixed channel matrix, the min-max problem can be solved
by the subgradient-based iterative method of [37] or the infeasible-start Newton method of [38], [42]. A
direct application of these algorithms to the large system limit requires asymptotic expressions for the
subgradient with respect to $\{\lambda_m\}$ or the KKT matrix, respectively. These quantities contain the second
order derivatives of the Lagrangian function with respect to $\{Q_k\}$ and $\{\lambda_m\}$, which do not appear to be
amenable to easily computable asymptotic limits.

A general method for the minimization with respect to $\{\lambda_m\}$ can be obtained as follows. Let $G_{\mathbf{W}}(\boldsymbol{\lambda})$
denote the solution of (11). This is a convex function of $\boldsymbol{\lambda}$ and the minimizing $\boldsymbol{\lambda}^*$ must have strictly
positive components by Lemma 2. Therefore, at the optimal point we have
$\left. \frac{\partial G_{\mathbf{W}}}{\partial \lambda_m} \right|_{\boldsymbol{\lambda} = \boldsymbol{\lambda}^*} = 0$ for all
$m = 1, \ldots, B$. It follows that the solution can be approached by gradient descent iterations where the
gradient can be estimated by numerical differentiation [43]. Let $\mathbf{e}_m$ be a $\gamma BN$-length vector for which
the elements $(m-1)\gamma N + 1, \ldots, m\gamma N$ are $\epsilon$ for some $\epsilon > 0$ and the other elements are zero. Then the
approximation for the partial derivative of $G_{\mathbf{W}}(\boldsymbol{\lambda})$ with respect to $\lambda_m$ is given by $\frac{G_{\mathbf{W}}(\boldsymbol{\lambda} + \mathbf{e}_m) - G_{\mathbf{W}}(\boldsymbol{\lambda} - \mathbf{e}_m)}{2\epsilon}$
with $O(\epsilon^2)$ error term [43]. Both $G_{\mathbf{W}}(\boldsymbol{\lambda} + \mathbf{e}_m)$ and $G_{\mathbf{W}}(\boldsymbol{\lambda} - \mathbf{e}_m)$ are computed by Algorithm 1.
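
The central-difference estimate described above can be sketched generically. In this illustration the perturbation is applied directly to the $B$-dimensional vector of multipliers (equivalent to perturbing the $m$-th diagonal block of $\boldsymbol{\Sigma}(\boldsymbol{\lambda})$), and `func` is a placeholder for $G_{\mathbf{W}}(\cdot)$; the quadratic used in the example is a stand-in chosen only because its gradient is known, not the actual objective.

```python
import numpy as np

def central_difference_gradient(func, lam, eps=1e-4):
    """Estimate the gradient of func at lam with central differences (O(eps^2) error)."""
    lam = np.asarray(lam, dtype=float)
    grad = np.zeros_like(lam)
    for m in range(lam.size):
        step = np.zeros_like(lam)
        step[m] = eps                   # perturb only the m-th multiplier
        grad[m] = (func(lam + step) - func(lam - step)) / (2.0 * eps)
    return grad

# Stand-in convex function with known gradient 2 * (lam - 1)
stand_in = lambda lam: float(np.sum((lam - 1.0) ** 2))
grad = central_difference_gradient(stand_in, [2.0, 0.5])
```

Each gradient estimate costs two evaluations of the objective per multiplier, i.e., $2B$ runs of the inner power optimization per descent step.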



From the above argument it follows that the general case where minimization with respect to $\{\lambda_m\}$ is
required does not present any conceptual difficulty beyond the fact that it may be numerically cumbersome.
Of course, a simple upper bound consists of relaxing the per-BS power constraint to a sum-power
constraint in the reference cluster. Notice that the solution and the value of the objective function are
invariant to a common scaling of the Lagrange multipliers. Therefore, we can assume $\frac{1}{B} \sum_{m=1}^{B} \lambda_m = 1$
without loss of generality. Letting $\lambda_m = 1$ for all $m$ yields the laxer sum-power constraint
$\sum_{k=1}^{A} Q_k \le \sum_{m=1}^{B} P_m = P_{\mathrm{tot}}$, where $P_{\mathrm{tot}}$ denotes the total transmit power of the cluster. This choice yields an upper
bound to the capacity region of the cluster (under the constraint of treating ICI as noise) and therefore
also provides an upper bound to the whole system achievable region under the assumption that all BSs
transmit at their maximum power.

Next, we present a system symmetry condition under which the sum-power and the per-BS power
solutions coincide. Assume the same BS power constraint $P_m = P$ for all $m = 1, \ldots, B$. Then, let
$A' = A/B$, assuming that $B$ divides $A$. In particular, this is true when we have the same number of user
groups in each cell of the cluster. Finally, assume that the $B \times A$ matrix of the coefficients $\boldsymbol{\beta} = \{\beta_{m,k}\}$
can be partitioned into $A'$ submatrices of size $B \times B$ such that each submatrix has the property that all
rows are permutations of the first row, and all columns are permutations of the first column. Since this
requirement is reminiscent of the condition for strongly symmetric discrete memoryless channels, we shall refer to
these submatrices as "strongly symmetric blocks". To fix ideas, consider Fig. 1, showing a linear cellular
layout with 2 cells and $K$ user groups. Let $K = 8$ and assume distance-dependent pathloss coefficients
yielding the matrix

$$\boldsymbol{\beta} = \begin{bmatrix} a & b & b & a & f & e & d & c \\ f & e & d & c & a & b & b & a \end{bmatrix}$$

for some positive numbers $a, b, c, d, e, f$. We notice that this matrix can be decomposed into the $A' = 4$
strongly symmetric blocks

$$\begin{bmatrix} a & f \\ f & a \end{bmatrix}, \quad \begin{bmatrix} b & e \\ e & b \end{bmatrix}, \quad \begin{bmatrix} b & d \\ d & b \end{bmatrix}, \quad \begin{bmatrix} a & c \\ c & a \end{bmatrix}$$

satisfying the above assumption.
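
The symmetry condition is easy to test mechanically. The sketch below checks whether a $B \times B$ block has all rows equal to permutations of the first row and all columns equal to permutations of the first column; the numerical values assigned to $a, \ldots, f$ are hypothetical and serve only to instantiate the two-cell example.

```python
import numpy as np

def is_strongly_symmetric(block):
    """True if every row is a permutation of the first row and every column of the first column."""
    block = np.asarray(block, dtype=float)
    first_row = np.sort(block[0])
    first_col = np.sort(block[:, 0])
    rows_ok = all(np.array_equal(np.sort(r), first_row) for r in block)
    cols_ok = all(np.array_equal(np.sort(c), first_col) for c in block.T)
    return rows_ok and cols_ok

# Hypothetical numerical values for the coefficients of the two-cell example
a, b, c, d, e, f = 1.0, 0.8, 0.1, 0.2, 0.3, 0.4
beta = np.array([[a, b, b, a, f, e, d, c],
                 [f, e, d, c, a, b, b, a]])

# User-group pairs (1,5), (2,6), (3,7), (4,8), written with 0-based column indices
pairs = [(0, 4), (1, 5), (2, 6), (3, 7)]
blocks = [beta[:, list(p)] for p in pairs]
all_symmetric = all(is_strongly_symmetric(blk) for blk in blocks)
```

Pairing adjacent columns instead, e.g. columns 1 and 2, generally does not produce a strongly symmetric block, which is why the partition must follow the mirror symmetry of the layout.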

When these symmetry conditions hold, the user groups corresponding to the same strongly symmetric
block (e.g., user group pairs (1, 5), (2, 6), (3, 7) and (4, 8) in the example) are statistically equivalent, in
the sense that they see exactly the same landscape of channel coefficients from all the BSs forming the
cluster. In this case, as will be clear in Section IV, we can restrict the weighted sum-rate maximization
in (5), (11) to the case where the weights $W_k$ are identical for all user groups in the same strongly
symmetric block. Without loss of generality, let us enumerate the user groups such that the $b$-th symmetric
block contains user groups with indices $k = (b-1)B + 1, \dots, bB$, with corresponding constant weights
$W_k = W^{(b)}$. Then, the objective function (13) takes on the form:

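\[
\sum_{b=1}^{A'} \Delta^{(b)} \, \frac{1}{N} \, \mathbb{E}\left[ \log \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{(b-1)B+1:A} \, \mathbf{Q}_{(b-1)B+1:A} \, \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right] \tag{29}
\]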
Fig. 1. A linear cellular layout with two cells and K symmetric user groups.



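where $\Delta^{(b)} = W^{(b)} - W^{(b-1)}$ for $b = 1, \dots, A'$, with $W^{(0)} = 0$, and where

\[
\mathbf{Q} = \frac{1}{N} \, \mathrm{diag}\big( \underbrace{Q_1, \dots, Q_1}_{N}, \underbrace{Q_2, \dots, Q_2}_{N}, \dots, \underbrace{Q_A, \dots, Q_A}_{N} \big),
\]

with trace constraint $\sum_{k=1}^{A} Q_k \le BP = P_{\mathrm{tot}}$. We have the following result: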



Theorem 2: Under the above system symmetry conditions, the minimization in the min-max problem
(5) in the limit of $N \to \infty$ is achieved for $\lambda_m = 1$ for all $m = 1, \dots, B$.

Proof: Let $P_{i,j} = \mathbb{E}\big[ |[\mathbf{H}]_{i,j}|^2 \big]$ denote the variance of the $(i,j)$-th element of the channel matrix.
For any $b = 1, \dots, A'$, the matrix $\mathbf{H}_{(b-1)B+1:A}$ has the property that the empirical distributions of the
element variances are the same for all rows, i.e., the cumulative distribution functions

\[
\frac{1}{(A - (b-1)B)N} \sum_{j=(b-1)BN+1}^{AN} 1\{ P_{i,j} \le x \}
\]

are the same for all row indices $i = 1, \dots, \gamma BN$. This means that the matrix of the element variances
$\{P_{i,j}\}$ corresponding to $\mathbf{H}_{(b-1)B+1:A}$ is row-regular (see definition in [32, Definition 5]). Under the
row-regularity condition, it follows that $\mathbf{H}_{(b-1)B+1:A}$ in the limit of $N \to \infty$ is statistically equivalent to a
matrix $\tilde{\mathbf{H}}_{(b-1)B+1:A} = \mathbf{G}_{(b-1)B+1:A} \mathbf{T}_{(b-1)B+1:A}^{1/2}$, where $\mathbf{G}_{(b-1)B+1:A}$ is an i.i.d. matrix with zero-mean,
unit-variance elements, and $\mathbf{T}_{(b-1)B+1:A}$ is a non-negative diagonal matrix whose asymptotic empirical
spectral distribution is given by the $N \to \infty$ limit of the common empirical distribution above. In particular,
the distribution of $\mathbf{H}_{(b-1)B+1:A}$ is asymptotically unitary left-invariant, that is, for any unitary matrix
$\mathbf{U}$ independent of $\mathbf{H}_{(b-1)B+1:A}$, the matrices $\mathbf{U} \mathbf{H}_{(b-1)B+1:A}$ and $\mathbf{H}_{(b-1)B+1:A}$ are asymptotically
identically distributed.
Let $\boldsymbol{\Pi}$ denote a $\gamma BN \times \gamma BN$ block-permutation matrix that permutes the $B$ blocks of consecutive
positions of length $\gamma N$ in the index vector $\{1, \dots, \gamma BN\}$. Using the above asymptotic statistical
equivalence, in the limit of large $N$ we can write, for any $\{\lambda_m\}$ and $\{Q_k\}$,

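\[
\begin{aligned}
& \sum_{b=1}^{A'} \Delta^{(b)} \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{(b-1)B+1:A} \mathbf{Q}_{(b-1)B+1:A} \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right] \\
& \quad \overset{(a)}{=} \sum_{b=1}^{A'} \Delta^{(b)} \frac{1}{N} \frac{1}{B!} \sum_{\boldsymbol{\Pi}} \mathbb{E}\left[ \log \left| \mathbf{I} + \boldsymbol{\Pi}^{\mathsf{H}} \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \boldsymbol{\Pi} \, \mathbf{H}_{(b-1)B+1:A} \mathbf{Q}_{(b-1)B+1:A} \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right] \\
& \quad = \sum_{b=1}^{A'} \Delta^{(b)} \frac{1}{N} \frac{1}{B!} \sum_{\boldsymbol{\Pi}} \mathbb{E}\left[ \log \left| \mathbf{I} + \big[ \boldsymbol{\Pi}^{\mathsf{H}} \boldsymbol{\Sigma}(\boldsymbol{\lambda}) \boldsymbol{\Pi} \big]^{-1} \mathbf{H}_{(b-1)B+1:A} \mathbf{Q}_{(b-1)B+1:A} \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right] \\
& \quad \overset{(b)}{\ge} \sum_{b=1}^{A'} \Delta^{(b)} \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \Big[ \frac{1}{B!} \sum_{\boldsymbol{\Pi}} \boldsymbol{\Pi}^{\mathsf{H}} \boldsymbol{\Sigma}(\boldsymbol{\lambda}) \boldsymbol{\Pi} \Big]^{-1} \mathbf{H}_{(b-1)B+1:A} \mathbf{Q}_{(b-1)B+1:A} \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right] \\
& \quad \overset{(c)}{=} \sum_{b=1}^{A'} \Delta^{(b)} \frac{1}{N} \mathbb{E}\left[ \log \left| \mathbf{I} + \mathbf{H}_{(b-1)B+1:A} \mathbf{Q}_{(b-1)B+1:A} \mathbf{H}_{(b-1)B+1:A}^{\mathsf{H}} \right| \right]
\end{aligned} \tag{30}
\]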



where (a) follows from the left-unitary invariance, (b) follows from Jensen's inequality and (c) from the
fact that, without loss of generality, we let $\frac{1}{B} \sum_{m=1}^{B} \lambda_m = 1$. This shows that, for asymptotically large
$N$ and under the given symmetry conditions of the channel coefficients and rate weights, the worst-case
Lagrange multipliers for the weighted maximization of the average rates in (11) are $\lambda_m = 1$. Since for
$N \to \infty$ the instantaneous rates in (5) converge to the average rates in (11), the theorem is proved.
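The Jensen step (b) can be sanity-checked numerically on a tiny deterministic instance. The sketch below assumes $B = 2$ blocks of size one and a fixed positive definite $2 \times 2$ matrix `M` standing in for $\mathbf{H}\mathbf{Q}\mathbf{H}^{\mathsf{H}}$; these values and the helper names are illustrative assumptions, and the check does not reproduce the random-matrix part of the argument.

```python
import math
from itertools import permutations


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def logdet_i_plus_sinv_m(lam, M):
    """log det(I + diag(lam)^{-1} M) for a 2x2 matrix M."""
    A = [[(1.0 if i == j else 0.0) + M[i][j] / lam[i] for j in range(2)]
         for i in range(2)]
    return math.log(det2(A))


M = [[2.0, 1.0], [1.0, 3.0]]   # assumed positive definite stand-in for H Q H^H
lam = (0.5, 1.5)               # multipliers whose average equals 1

# Left-hand side of (b): average over all permutations of the multipliers.
perms = list(permutations(lam))
lhs = sum(logdet_i_plus_sinv_m(p, M) for p in perms) / len(perms)

# Right-hand side of (b): the permutation-averaged multipliers are all equal
# to the mean, which is 1 here, so the expression reduces to log det(I + M).
mean_lam = sum(lam) / len(lam)
rhs = logdet_i_plus_sinv_m((mean_lam, mean_lam), M)

print(lhs, rhs)
```

Because $\log\det(\mathbf{I} + \mathbf{A}^{-1}\mathbf{M})$ is convex in the positive definite matrix $\mathbf{A}$, the permutation average can never fall below the value at the averaged multipliers, which is what the comparison of `lhs` and `rhs` illustrates.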



IV. Fairness scheduling 

Downlink opportunistic scheduling is currently used by "high data rate" third-generation cellular
systems such as EV-DO [39] and HSDPA [40]. It is expected that in the next generation of systems
based on MIMO-OFDM, such as IEEE 802.16m [1] and LTE-Advanced [2], such strategies will be
integrated with the MU-MIMO physical layer. In such systems, each cooperation cluster runs a downlink
scheduler that computes a set of rate weight coefficients and, at each scheduling time slot $t$, solves the
maximization of the instantaneous weighted rate-sum subject to the per-BS power constraint, as in (5).
The result of this maximization provides the power and rate allocation and the corresponding downlink
precoder parameters (i.e., the beamforming vectors and the DPC encoding order) to be used in the current
slot. The scheduler weights are recursively computed such that the time-averaged user rates converge to
the desired ergodic rate point $\mathbf{R}^*$, the solution of (10).
The scheduling policy can be systematically designed by using the stochastic optimization approach
of [16], [17], based on the idea of "virtual queues". Notice that we do not consider exogenous arrivals:
consistently with the classical information theoretic setting, we assume that an arbitrarily large number
of information bits are to be transmitted to the users in each cluster (infinitely backlogged system). The
virtual queues defined here are only a tool to recursively compute the weights of the scheduling policy. In
order to illustrate the scheduling mechanism we will denote instantaneous quantities as dependent on the
slot index $t$. In short, the policy ensures that the virtual queue of each user $(k, i)$ (i.e., user $i$ in group $k$)
is strongly stable (see [16, Definition 3.1]). This implies that the arrival rate $\bar{A}_{k,i}$ is strictly less than the
average service rate $\bar{R}_{k,i} = \mathbb{E}[R_{k,i}(\mathbf{H}(t))]$. Then, the desired ergodic rate point $\mathbf{R}^*$ can be approached
if the virtual queues are fed by virtual arrival processes $A_{k,i}(t)$ with arrival rates $\bar{A}_{k,i} = \mathbb{E}[A_{k,i}(t)]$
sufficiently close to the desired values $R^*_{k,i}$. The interesting feature of this approach is that it is possible
to generate such virtual arrival processes adaptively, even if the values $R^*_{k,i}$ are unknown a priori and
may be very difficult to calculate directly.

Let $U_{k,i}(t)$ denote the virtual queue backlog for user $i$ in group $k$ at time slot $t$, evolving according
to the stochastic difference equation

\[
U_{k,i}(t+1) = \big[ U_{k,i}(t) - R_{k,i}(\mathbf{H}(t)) \big]_+ + A_{k,i}(t) \tag{31}
\]

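We consider the scheduling policy given as follows:

• At each time slot $t$, solve the weighted sum-rate maximization problem

\[
\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{A} \sum_{i=1}^{N} U_{k,i}(t) \, R_{k,i}(\mathbf{H}(t)) \\
\text{subject to} \quad & \mathrm{tr}\big( \mathrm{Cov}(\mathbf{x}_m) \big) \le P_m
\end{aligned} \tag{32}
\]

• The virtual queues are updated according to (31), where the arrival processes are given by $A_{k,i}(t) = a^*_{k,i}$,
where the vector $\mathbf{a}^*$ is the solution of the maximization problem:

\[
\begin{aligned}
\text{maximize} \quad & V g(\mathbf{a}) - \sum_{k=1}^{A} \sum_{i=1}^{N} a_{k,i} \, U_{k,i}(t) \\
\text{subject to} \quad & 0 \le a_{k,i} \le A_{\max}
\end{aligned} \tag{33}
\]

for suitable $V > 0$ and $A_{\max} > 0$.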
The parameters $V$ and $A_{\max}$ determine the accuracy and the rate of convergence of the time-average rates
to their expected values. It can be shown [16], [17] that, for a fixed, sufficiently large $A_{\max}$, the
gap between the long-time average rates $\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} R_{k,i}(\mathbf{H}(\tau))$ and the optimal ergodic rates $R^*_{k,i}$
decreases as $O(1/V)$ while the expected backlog of the virtual queues increases as $O(V)$.
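To make the mechanics of the policy concrete, the following is a minimal sketch on a toy system, not the MU-MIMO problem of the paper. It assumes two users, a time-sharing rate region in which a single user is served per slot at a fixed peak rate, and the proportional-fair utility $g(\mathbf{a}) = \sum_k \log a_k$; the function name `simulate_virtual_queues` and all numeric values are illustrative assumptions.

```python
def simulate_virtual_queues(peak_rates, V=50.0, a_max=2.0, num_slots=20000):
    """Toy run of the virtual-queue policy with g(a) = sum_k log(a_k).

    Assumption: one user is served per slot at its fixed peak rate
    (time sharing), which replaces the weighted sum-rate problem (32).
    """
    n = len(peak_rates)
    U = [0.0] * n          # virtual queue backlogs U_k(t)
    served = [0.0] * n     # accumulated offered service per user
    for _ in range(num_slots):
        # Weighted sum-rate step: a linear objective over a time-sharing
        # region is maximized by serving the user with the largest U_k * rate_k.
        k_star = max(range(n), key=lambda k: U[k] * peak_rates[k])
        R = [peak_rates[k] if k == k_star else 0.0 for k in range(n)]
        # Arrival step: maximize V*log(a_k) - a_k*U_k on [0, a_max],
        # whose solution is a_k = min(V / U_k, a_max).
        a = [a_max if U[k] <= 0.0 else min(V / U[k], a_max) for k in range(n)]
        # Queue update: U(t+1) = [U(t) - R(t)]_+ + A(t).
        U = [max(U[k] - R[k], 0.0) + a[k] for k in range(n)]
        for k in range(n):
            served[k] += R[k]
    return [s / num_slots for s in served], U


avg_rates, backlog = simulate_virtual_queues([1.0, 2.0])
print(avg_rates, backlog)
```

For this toy region the proportional-fair point gives each user half of the slots, so the time-averaged rates should settle near half of each peak rate, while the backlogs grow with `V`.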

After reviewing the above background on scheduling and stochastic optimization, we are ready to
make some observations that are instrumental for the performance computation in the large-system limit.
Due to the statistical equivalence of users in the same group, the ergodic rate points with $R_{k,i} = R_k$
(independent of $i$) are achievable. In particular, the boundary of the system ergodic capacity region and
of the region $\mathcal{C}(P_1, \dots, P_B)$ defined earlier coincide for all rate points meeting this condition. It is meaningful
to assume that the network utility function $g(\mathbf{R})$ is invariant with respect to permutations of the rates
of statistically equivalent users. In fact, all statistically equivalent users should be treated equally in
the long-term average sense.¹ For example, the α-fairness utility function proposed in [18] satisfies this
condition. In this case, it is immediate to show that the function $g(\mathbf{R})$ is maximized by some rate point
with equal rates over each user group or, if the symmetry conditions of Theorem 2 hold, over all groups
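in the same strongly symmetric block. Hence, in the large-system limit, the point $\mathbf{R}^* = \{R^*_{k,i}\}$, solution of
(10), must satisfy, for all $i$,

\[
R^*_{k,i} = \lim_{N \to \infty} \frac{1}{N} \mathbb{E}\left[ \log \frac{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{k:A} \mathbf{Q}_{k:A} \mathbf{H}_{k:A}^{\mathsf{H}} \right| }{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{k+1:A} \mathbf{Q}_{k+1:A} \mathbf{H}_{k+1:A}^{\mathsf{H}} \right| } \right]
\]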



where the term on the right-hand side is the average per-user rate given by the solution of (11) for some
choice of the weights $\{W_k\}$ and Lagrange multipliers $\{\lambda_m\}$. It is well-known that, for a deterministic
network, the dynamic scheduling policy described before coincides with the Lagrangian dual optimization
with outer subgradient iteration, where the Lagrangian dual variables play the role of the virtual queue
backlogs in the dynamic setting. In the large-system limit, the channel uncertainty disappears and the
MU-MIMO system "freezes" to a deterministic limit. Using the large-system limit solution of (11) presented
earlier, the solution of the fairness scheduling problem (10) can be addressed directly, using
Lagrangian duality.



¹Here we assume that all users have equal priority. For example, they are all delay-tolerant data users with no particular
individual priority: users differ only by their location in the cluster, which determines their channel coefficients $\{\beta_{m,k}\}$.
A. Lagrangian optimization

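We rewrite (10) using the auxiliary variables $\mathbf{r} = [r_1, \dots, r_A]$ and using the definition of the ergodic
rate region as:

\[
\begin{aligned}
\min_{\boldsymbol{\lambda}} \max_{\mathbf{r}, \mathbf{Q}, \pi} \quad & g(\mathbf{r}) \\
\text{subject to} \quad & r_{\pi_k} \le \frac{1}{N} \mathbb{E}\left[ \log \frac{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_k:\pi_A} \mathbf{Q}_{\pi_k:\pi_A} \mathbf{H}_{\pi_k:\pi_A}^{\mathsf{H}} \right| }{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_{k+1}:\pi_A} \mathbf{Q}_{\pi_{k+1}:\pi_A} \mathbf{H}_{\pi_{k+1}:\pi_A}^{\mathsf{H}} \right| } \right] \quad \forall k, \\
& \mathrm{tr}(\mathbf{Q}) \le P_{\mathrm{tot}}, \quad \boldsymbol{\lambda} \ge 0
\end{aligned} \tag{34}
\]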



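The Lagrange function for (34) is given by

\[
\begin{aligned}
\mathcal{L}(\boldsymbol{\lambda}, \mathbf{r}, \mathbf{Q}, \pi, \mathbf{W})
&= g(\mathbf{r}) - \sum_{k=1}^{A} W_{\pi_k} \left( r_{\pi_k} - \frac{1}{N} \mathbb{E}\left[ \log \frac{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_k:\pi_A} \mathbf{Q}_{\pi_k:\pi_A} \mathbf{H}_{\pi_k:\pi_A}^{\mathsf{H}} \right| }{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_{k+1}:\pi_A} \mathbf{Q}_{\pi_{k+1}:\pi_A} \mathbf{H}_{\pi_{k+1}:\pi_A}^{\mathsf{H}} \right| } \right] \right) \\
&= \underbrace{ g(\mathbf{r}) - \sum_{k=1}^{A} W_k r_k }_{f_{\mathbf{W}}(\mathbf{r})}
 + \underbrace{ \sum_{k=1}^{A} W_{\pi_k} \frac{1}{N} \mathbb{E}\left[ \log \frac{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_k:\pi_A} \mathbf{Q}_{\pi_k:\pi_A} \mathbf{H}_{\pi_k:\pi_A}^{\mathsf{H}} \right| }{ \left| \mathbf{I} + \boldsymbol{\Sigma}^{-1}(\boldsymbol{\lambda}) \, \mathbf{H}_{\pi_{k+1}:\pi_A} \mathbf{Q}_{\pi_{k+1}:\pi_A} \mathbf{H}_{\pi_{k+1}:\pi_A}^{\mathsf{H}} \right| } \right] }_{h_{\mathbf{W}}(\boldsymbol{\lambda}, \mathbf{Q}, \pi)}
\end{aligned} \tag{35}
\]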



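where $\mathbf{W}$ is the vector of dual variables corresponding to the auxiliary variable constraints (rate
constraints). The Lagrange function can be decomposed into a sum of a function of $\mathbf{r}$ only, denoted by
$f_{\mathbf{W}}(\mathbf{r})$, and a function of $\boldsymbol{\lambda}$, $\mathbf{Q}$ and $\pi$ only, denoted by $h_{\mathbf{W}}(\boldsymbol{\lambda}, \mathbf{Q}, \pi)$. The Lagrange dual function for the
problem (35) is given by

\[
\mathcal{G}(\mathbf{W}) = \min_{\boldsymbol{\lambda}} \max_{\mathbf{r}, \mathbf{Q}, \pi} \mathcal{L}(\boldsymbol{\lambda}, \mathbf{r}, \mathbf{Q}, \pi, \mathbf{W})
= \underbrace{ \max_{\mathbf{r}} f_{\mathbf{W}}(\mathbf{r}) }_{(a)} + \underbrace{ \min_{\boldsymbol{\lambda}} \max_{\mathbf{Q}, \pi} h_{\mathbf{W}}(\boldsymbol{\lambda}, \mathbf{Q}, \pi) }_{(b)} \tag{36}
\]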



and it is obtained by the decoupled maximization in (a) (with respect to $\mathbf{r}$) and the min-max in (b) (with
respect to $\boldsymbol{\lambda}, \mathbf{Q}, \pi$). Notice that problems (a) and (b) correspond to the static forms of (33) and (32),
respectively, after identifying $\mathbf{r}$ with the virtual arrival rates $\mathbf{A}(t)$ and $\mathbf{W}$ with the virtual queue backlogs
$\mathbf{U}(t)$. Finally, we can solve the dual problem defined as

\[
\min_{\mathbf{W} \ge 0} \mathcal{G}(\mathbf{W}) \tag{37}
\]



The solution of (37) can then be found via inner-outer iterations as follows:

Inner Problem: For given $\mathbf{W}$, we solve (36) with respect to $\boldsymbol{\lambda}$, $\mathbf{r}$, $\mathbf{Q}$ and $\pi$. This can be further
decomposed into:

• Subproblem (a): Since $f_{\mathbf{W}}(\mathbf{r})$ is concave in $\mathbf{r} \ge 0$, the optimal $\mathbf{r}^*$ is readily obtained by imposing the
KKT conditions.

• Subproblem (b): Taking the limit of $N \to \infty$, this problem is solved by Algorithm 1 for fixed $\boldsymbol{\lambda} \ge 0$.
If the system satisfies the symmetry conditions of Theorem 2, then we let $\lambda_m = 1$ and no
minimization with respect to $\boldsymbol{\lambda}$ is needed. If these conditions do not hold, the outer minimization
can be solved by the gradient descent method with the approximated gradient. Alternatively, letting
$\lambda_m = 1$ yields an upper bound on the achievable network utility function, corresponding to the
relaxation of the per-BS power constraint to the sum-power constraint.

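Outer Problem: the minimization of $\mathcal{G}(\mathbf{W})$ with respect to $\mathbf{W} \ge 0$ can be obtained by subgradient
adaptation. Let $\boldsymbol{\lambda}^*$, $\pi^*$, $\mathbf{Q}^*$ and $\mathbf{r}^*(\mathbf{W})$ denote² the solution of the inner problem for fixed $\mathbf{W}$. For any
$\mathbf{W}'$, we have

\[
\begin{aligned}
\mathcal{G}(\mathbf{W}') &= \max_{\mathbf{r}} f_{\mathbf{W}'}(\mathbf{r}) + \max_{\mathbf{Q}} h_{\mathbf{W}'}(\boldsymbol{\lambda}^*, \mathbf{Q}, \pi^*) \\
&\ge f_{\mathbf{W}'}(\mathbf{r}^*(\mathbf{W})) + h_{\mathbf{W}'}(\boldsymbol{\lambda}^*, \mathbf{Q}^*, \pi^*) \\
&= \mathcal{G}(\mathbf{W}) + \sum_{k=1}^{A} (W'_k - W_k) \big( R^*_k(\mathbf{W}) - r^*_k(\mathbf{W}) \big)
\end{aligned} \tag{38}
\]

where $R^*_k(\mathbf{W})$ denotes the $k$-th group rate resulting from the solution of the inner problem with weights
$\mathbf{W}$, which is efficiently calculated by Algorithm 1 in the large-system regime. A subgradient for $\mathcal{G}(\mathbf{W})$ is
given by the vector with components $R^*_k(\mathbf{W}) - r^*_k(\mathbf{W})$. The dual variables $\mathbf{W}$ can then be updated
at the $n$-th outer iteration according to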

\[
W_k(n+1) = W_k(n) - \mu(n) \big( R^*_k(\mathbf{W}(n)) - r^*_k(\mathbf{W}(n)) \big), \quad \forall k \tag{39}
\]

for some step size $\mu(n) > 0$, which can be determined by an efficient algorithm such as the back-tracking
line search method [44] or the ellipsoid method [45]. In the numerical example of Section V we use the
back-tracking line search method. It should be noticed that by setting $\mu(n) = 1$ this subgradient update plays
the role of the virtual queue update in the dynamic scheduling policy of (31). However, in this optimization,
the objective function converges to a single optimal point through the iterations and, by adjusting the step size
$\mu(n)$ with the above methods, fast convergence can be attained.

As an application example of the above general optimization, we focus on the two special cases of
proportional fairness scheduling (PFS) and hard-fairness scheduling (HFS), also known as max-min
fairness scheduling.

²It is useful to explicitly point out the dependence on $\mathbf{W}$ only for $\mathbf{r}^*(\mathbf{W})$, since this appears in the subgradient expression,
although it is clear that $\boldsymbol{\lambda}^*$, $\pi^*$ and $\mathbf{Q}^*$ also depend, in general, on $\mathbf{W}$.

B. Proportional fairness scheduling 

The network utility function for PFS is given as

\[
g(\mathbf{r}) = \sum_{k=1}^{A} \log(r_k) \tag{40}
\]

In this case, the KKT conditions for the inner subproblem (a) yield the solution

\[
r^*_k(\mathbf{W}) = 1 / W_k, \quad \forall k \tag{41}
\]

(notice that $r_k$ must be positive for all $k$, otherwise the objective function is $-\infty$). As mentioned before,
the dual variables play the role of the virtual queue backlogs in the dynamic scheduling policy, while
the auxiliary variables $\mathbf{r}$ correspond to the virtual arrival rates. From (41), we see that at the $n$-th outer
iteration these variables are related by $W_k(n) = 1 / r^*_k(\mathbf{W}(n))$. As observed at the beginning of Section IV,
the virtual arrival rates of the dynamic scheduling policy are designed in order to be close to the ergodic
rates $\mathbf{R}^*$ at the optimal fairness point. It follows that the usual intuition of PFS, according to which the
scheduler weights are inversely proportional to the long-term average user rates, is recovered.
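The inner-outer iteration for PFS can be sketched on a toy deterministic problem. The code below is an illustration under stated assumptions, not the paper's Algorithm 1: the inner weighted sum-rate problem is replaced by weighted water-filling over parallel Gaussian channels with a sum-power constraint, and the step size is a fixed constant rather than a back-tracking line search. The function names, channel gains and power budget are hypothetical.

```python
import math


def weighted_waterfilling(W, gains, p_tot):
    """Rates maximizing sum_k W_k*log(1 + gains_k*p_k) s.t. sum_k p_k <= p_tot.

    The KKT conditions give p_k = max(0, W_k/nu - 1/gains_k); the
    multiplier nu is found by bisection on the total power.
    """
    def total_power(nu):
        return sum(max(0.0, w / nu - 1.0 / g) for w, g in zip(W, gains))

    lo, hi = 1e-12, max(w * g for w, g in zip(W, gains))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_power(mid) > p_tot:
            lo = mid   # too much power used: the multiplier must increase
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    p = [max(0.0, w / nu - 1.0 / g) for w, g in zip(W, gains)]
    return [math.log(1.0 + g * pk) for g, pk in zip(gains, p)]


def pfs_dual_subgradient(gains, p_tot, step=0.05, num_iter=2000):
    """Outer subgradient iteration with the PFS inner solution r_k = 1/W_k."""
    W = [1.0] * len(gains)
    for _ in range(num_iter):
        R = weighted_waterfilling(W, gains, p_tot)   # toy subproblem (b)
        r = [1.0 / w for w in W]                     # subproblem (a) for PFS
        # Subgradient step; the floor keeps the weights strictly positive.
        W = [max(w - step * (Rk - rk), 1e-9) for w, Rk, rk in zip(W, R, r)]
    return W, weighted_waterfilling(W, gains, p_tot)


W_opt, R_opt = pfs_dual_subgradient(gains=[1.0, 4.0], p_tot=10.0)
print(W_opt, R_opt)
```

At convergence the subgradient vanishes, so each weight equals the reciprocal of the corresponding rate, which is the inverse-proportionality property stated above.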

C. Hard fairness scheduling 

In the case of HFS, the scheduler maximizes the minimum user ergodic rate. The network utility function
is given by

\[
g(\mathbf{r}) = \min_{k = 1, \dots, A} r_k. \tag{42}
\]

This objective function is neither strictly concave nor differentiable everywhere. Therefore, it is convenient
to rewrite subproblem (a) by introducing an auxiliary variable $\gamma$, as follows:

\[
\begin{aligned}
\max_{\gamma, \, \mathbf{r} \ge 0} \quad & \gamma - \sum_{k=1}^{A} W_k r_k \\
\text{subject to} \quad & r_k \ge \gamma, \quad \forall k
\end{aligned} \tag{43}
\]

The solution must satisfy $r_k = \gamma$ for all $k$, leading to

\[
\max_{\gamma \ge 0} \Big( 1 - \sum_{k=1}^{A} W_k \Big) \gamma. \tag{44}
\]
Since the original maximization in (34) is bounded while (44) may be unbounded, we must have that
$\sum_{k=1}^{A} W_k = 1$, and $\gamma$ must take on some appropriate value that enforces this condition. The subgradient
iteration for the weights $\mathbf{W}$, using $r^*_k(\mathbf{W}(n)) = \gamma^*(\mathbf{W}(n))$, becomes

\[
W_k(n+1) = W_k(n) - \mu \big( R^*_k(\mathbf{W}(n)) - \gamma^*(\mathbf{W}(n)) \big), \quad \forall k \tag{45}
\]

Summing up the update equations over $k = 1, \dots, A$ and using the condition that $\sum_{k=1}^{A} W_k(n) = 1$ for
all $n$, we obtain

\[
r^*_k(\mathbf{W}(n)) = \gamma^*(\mathbf{W}(n)) = \frac{1}{A} \sum_{j=1}^{A} R^*_j(\mathbf{W}(n)), \quad \forall k \tag{46}
\]
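The HFS weight update can be illustrated in the same spirit. This is a minimal sketch, assuming a toy inner problem (weighted water-filling over parallel Gaussian channels with a sum-power constraint) in place of the paper's large-system algorithm, and a constant step size; the function names and numeric values are hypothetical. It shows that the update keeps the weights summing to one and drives the rates to a common value.

```python
import math


def weighted_rates(W, gains, p_tot):
    """Rates maximizing sum_k W_k*log(1 + gains_k*p_k) s.t. sum_k p_k <= p_tot,
    using p_k = max(0, W_k/nu - 1/gains_k) with nu found by bisection."""
    def total_power(nu):
        return sum(max(0.0, w / nu - 1.0 / g) for w, g in zip(W, gains))

    lo, hi = 1e-12, max(w * g for w, g in zip(W, gains))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_power(mid) > p_tot:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    p = [max(0.0, w / nu - 1.0 / g) for w, g in zip(W, gains)]
    return [math.log(1.0 + g * pk) for g, pk in zip(gains, p)]


def hfs_dual_subgradient(gains, p_tot, step=0.05, num_iter=1000):
    """HFS outer iteration: gamma is the mean rate, so sum_k W_k stays at 1."""
    A = len(gains)
    W = [1.0 / A] * A                       # start on the simplex
    for _ in range(num_iter):
        R = weighted_rates(W, gains, p_tot)
        gamma = sum(R) / A                  # common auxiliary rate
        W = [w - step * (Rk - gamma) for w, Rk in zip(W, R)]
    return W, weighted_rates(W, gains, p_tot)


W_hfs, R_hfs = hfs_dual_subgradient(gains=[1.0, 4.0], p_tot=10.0)
print(W_hfs, R_hfs)
```

For these toy gains the max-min point splits the power as 8 and 2, so both rates approach $\log 9$; the weaker channel ends up with the larger weight.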

V. Numerical Results 

In this section we present some examples of the multi-cell model considered in this paper and compare
(when possible) the numerical results using the proposed large-system analysis with the results of Monte
Carlo simulation applied to an actual finite-dimensional system subject to the dynamic fairness scheduling
policy outlined at the beginning of Section IV.

The examples involve a one-dimensional 2-cell model (M = 2) and a two-dimensional three-sectored
7-cell model (M = 21). In both cases, the system parameters and pathloss model are based on the mobile
WiMAX system evaluation specification [14] with cell radius 1.0 km and no shadowing. The
2-cell model, shown in Fig. 1, considers two one-sided BSs with $\gamma = 4$, located at ±1 km, and K = 8 user
groups equally spaced between the BSs. We consider the cases of full BS cooperation and of no cooperation
with a symmetric partition of users, the latter yielding L = 2 clusters with $\mathcal{K}_1 = \{1, 2, 3, 4\}$ and $\mathcal{K}_2 = \{5, 6, 7, 8\}$.

Fig. 2 illustrates the convergence of the utility function and individual group rates under PFS. In Fig.
2(a), the PFS objective functions in the no cooperation and full cooperation cases are shown to converge
to the respective optimal PFS points. Not surprisingly, the fully cooperative system achieves a significantly
higher value of the objective function (sum of the log-rates). In Fig. 2(b) we compare the asymptotic
rates in the large-system limit with the achievable rates obtained by using Monte Carlo simulation in
finite dimension. In finite dimension we considered N = 1, 2, or 4 and the same parameters as in the
infinite-dimensional case. The channel vectors are randomly generated and change at every $t$ in an i.i.d.
fashion, and the instantaneous rates are allocated by using DPC with the water-filling algorithm
[36] combined with the dynamic scheduling policy outlined in Section IV. Remarkably, the
finite-dimensional simulation produced nearly the same rates for the considered values of N, and these rates
also almost overlap with the large-system asymptotic results even for very small N. Notice that the
dynamic scheduling policy should provide multi-user diversity gain and in general should achieve higher
rates than the large-system limit, which is not able to exploit the dynamic fluctuations of the small-scale
fading due to "channel hardening". However, it appears that in the regime where the pathloss is dominant
over the randomness of the multi-antenna channels and the number of users is not much larger than the
number of BS antennas, the multi-user diversity gain is negligible and the asymptotic analysis generates
results very close to the simulations with dynamic scheduling and DPC.

Fig. 2. Proportional fairness scheduling with $\gamma = 4$ and K = 8 in the 2-cell model. Panel (b): individual group rates.

Fig. 3 shows the convergence behavior of the utility function and individual group rates under HFS. In the
HFS case, all the users achieve the same individual rate, which is slightly higher than the smallest rate of
the PFS case. Also, the agreement of the individual user rates with the finite-dimensional simulation
is remarkable.

Fig. 3. Hard fairness scheduling with $\gamma = 4$ and K = 8 in the 2-cell model. Panel (b): individual group rates.

Using the proposed asymptotic analysis, validated in the simple 2-cell model, we can obtain ergodic
rate distributions for much larger systems, for which a full-scale simulation would be very demanding.
We considered a two-dimensional cell layout where 7 hexagonal cells form a network and each cell
consists of three 120-degree sectors. As depicted in Fig. 4(a), three BSs are co-located at the center of
each cell such that each BS handles one sector in the no cooperation case. Each sector is split into 4
diamond-shaped equal-area grids and one user group is placed at the center of each grid. Therefore there
are in total M = 21 BSs and K = 84 user groups in the network. In addition, we assume a wrap-around
torus topology as shown in Fig. 4(b), such that each cell is virtually surrounded by the other 6 cells and
all the cells have the same symmetric ICI distribution. The antenna orientation and pattern follow [46], and the
non-ideal spatial antenna gain pattern (overlapping between sectors in the same cell) generates ICI even
between sectors in the same cell with no cooperation. This model is relevant for a macro-cell network
where both the ICI and the effective inter-cell cooperation are due to neighboring cells. Fig. 5 shows the
user rate distribution under three levels of cooperation: (a) no cooperation (L = 21 single-sector clusters),
(b) cooperation among the 3 co-located sector BSs (L = 7 clusters formed by the three sectors of each cell),
and (c) full cooperation over the 7-cell network (L = 1). The asymptotic rate results show that
in case (b) the cooperation gain over case (a) is primarily obtained for the users around cell centers,
while in case (c) the cooperation gain is attained over the whole cellular coverage area.

Fig. 4. Two-dimensional three-sectored 7-cell model. (a) 3-sectored cell configuration. (b) Wrap-around torus topology.

VI. Conclusions 

We considered the downlink of a multi-cell MU-MIMO cellular system where the pathloss and inter-cell
interference make the users' channel statistics unequal. In this case, it is important to evaluate the
system performance subject to some form of fairness. Downlink scheduling that makes the system operate
at a desired point of the long-term average achievable rate region is an important issue, widely studied
and widely applied in practice [15], [39], [40]. This is classically formulated as the maximization of a
concave network utility function over the achievable ergodic rate region of the system. We also considered
an inter-cell cooperation scheme in which groups of cells operate jointly, as a distributed multi-antenna
transmitter, and have perfect channel state information for all users in their cluster and only statistical
information on the inter-cluster interference. Under the constraint that inter-cluster interference is treated
as noise, this model is quite general.

Fig. 5. Ergodic user rate distribution in the 7-cell model.

We focused on the large-system limit where the number of base station antennas and the number 
of users at each location go to infinity with a fixed ratio. In this regime, we presented a semi-analytic 
method for the computation of the optimal fairness rate point, based on a combination of large random 
matrix results and Lagrangian optimization. The proposed method is particularly simple and efficient in 
the case where the system has certain symmetries. Otherwise, we can obtain a simple upper bound by 
relaxing the per-base station power constraint to the per-cluster sum-power constraint. Numerical results 
showed that the rates predicted by the large-system analysis are indeed remarkably close to the rates obtained by
Monte Carlo simulation of a corresponding finite-dimensional system. Overall, the results of this paper
are useful in evaluating and optimizing multi-cell MU-MIMO systems, especially when the system
dimension and network size are large.